Genomic Analysis of Halotolerant Bacterial Strains Martelella soudanensis NC18T and NC20

Two novel, halotolerant strains of Martelella soudanensis, NC18T and NC20, were isolated from deep subsurface sediment, deeply sequenced, and compared with related strains. In a phylogenetic analysis based on 16S rRNA gene sequences, the two strains grouped with members of the genus Martelella. Here, we sequenced the complete genomes of NC18T and NC20 to understand the mechanisms of their halotolerance. Both genomes were approximately 6.1 Mb in size with a G+C content of 61.8 mol%. NC18T and NC20 were predicted to contain 5,849 and 5,830 genes, and 5,502 and 5,585 protein-coding genes, respectively. Both strains were predicted to contain 6 rRNA and 48 tRNA genes. The halotolerance-associated genes found in the genomes suggest that strains NC18T and NC20 might tolerate high salinity by accumulating potassium ions in a “salt-in” strategy mediated by the K+ uptake protein (kup) and the K+ transport systems (trkAH and kdpFABC). The two strains also use the ectoine transport system (dctPQM), the glycine betaine transport system (proVWX), and the glycine betaine uptake protein (opu) to accumulate “compatible solutes,” such as ectoine and glycine betaine, which protect cells from salt stress. This study reveals the halotolerance mechanism of strains NC18T and NC20 in high-salt environments and suggests potential applications for these halotolerant and halophilic strains in environmental biotechnology.


Phylogenetic and Phylogenomic Analysis
The 16S rRNA gene sequence similarity between the two strains and closely related taxa was determined using the EzBioCloud server (www.ezbiocloud.net) [30]. The 16S rRNA gene sequences were aligned using the CLUSTAL X software program [31], and gaps were edited in the BioEdit program [32]. Phylogenetic trees were constructed in MEGA 6.0 using the neighbor-joining, maximum-likelihood, and maximum-parsimony methods [33]. Statistical reliability was assessed from 1,000 bootstrap replicates. The G+C content of the genomic DNA was determined from each genome sequence. The average nucleotide identity (ANI) and in silico DNA-DNA hybridization (DDH) values were calculated using the EZGenome web service (www.ezbiocloud.net/tools/ani) and the Genome-to-Genome Distance Calculator (http://ggdc.dsmz.de/ggdc.php), respectively [34,35]. To understand the salt tolerance mechanism, gene clusters were compared with the publicly available genomic sequences of Martelella mediterranea DSM 17316T (GCF_002043005), Martelella endophytica YC6887T (GCF_000960975), and Martelella lutilitoris GH2-6T (GCF_005924265) using Mauve (version 20150226) [36].
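The G+C content calculation mentioned above can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names and the toy FASTA record are hypothetical, and the sketch assumes that ambiguous bases (such as N) are excluded from the denominator.

```python
from typing import Iterable


def read_fasta_sequence(lines: Iterable[str]) -> str:
    """Concatenate the sequence lines of a FASTA stream, skipping header lines."""
    return "".join(
        line.strip()
        for line in lines
        if line.strip() and not line.startswith(">")
    )


def gc_content(sequence: str) -> float:
    """Return G+C content in percent, counting only unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy record for illustration only; a real run would pass an open genome FASTA file.
toy_fasta = [">toy_contig", "GGCCATGC", "ATGCGCNN"]
toy_sequence = read_fasta_sequence(toy_fasta)
print(f"G+C content: {gc_content(toy_sequence):.1f}%")
```

For a multi-replicon genome, the same function would be applied to the concatenated sequence of all replicons so that each base is weighted equally.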

Availability of Data and Materials
Strains NC18T and NC20 were deposited in the Korean Collection for Type Cultures (KCTC) and the NITE Biological Resource Center (NBRC) under the deposit numbers KCTC 82174T = NBRC 114661T and KCTC 82175 = NBRC 114662, respectively. The 16S rRNA gene sequences of strains NC18T and NC20 were deposited in GenBank/EMBL/DDBJ under accession numbers MT367774 and MT367775, respectively. The genomic sequences of strains NC18T and NC20 were deposited at DDBJ/ENA/GenBank under accession numbers CP054858-CP054860 and CP054861-CP054863, respectively.

General Genomic Characteristics and Annotation
The general genomic features of M. soudanensis NC18T and NC20 are listed in Table 1. The complete genome sequence of strain NC18T comprised a circular chromosome of 6,109,459 bp containing 5,531 functional CDSs, 292 pseudogenes, 6 rRNA genes, and 48 tRNA genes, with an average G+C content of 61.8%. The genome of strain NC20 comprised 6,109,677 bp with a G+C content of 61.8% and contained 5,467 functional CDSs, 275 pseudogenes, 6 rRNA genes, and 48 tRNA genes (Fig. S2).

Functional Categorization
The genes encoded in the M. soudanensis NC18T and NC20 genomes were functionally categorized according to the KEGG, COG, and SEED databases. The 5,823 CDSs of M. soudanensis NC18T and the 5,742 CDSs of M. soudanensis NC20 were assigned to 4,424 and 4,413 KEGG identifiers (Table S1), 3,984 and 3,937 COG identifiers (excluding function unknown) (Table S2), and 2,324 and 2,248 SEED identifiers (Table S3), respectively. In the KEGG analysis, carbohydrate and amino acid metabolism were the most abundant categories, indicating that M. soudanensis is capable of using a variety of carbon sources (Fig. 2A, Table S1) [16]. Genes related to membrane transport and signal transduction also ranked high, suggesting a higher tolerance of Martelella to harsh salt conditions. In the COG analysis, apart from function unknown (S), the most frequently assigned categories were amino acid transport and metabolism (E), carbohydrate transport and metabolism (G), inorganic ion transport and metabolism (P), and transcription (K) (Fig. 2B, Table S2). These categories are closely linked to the acquisition of nutrients from various environments and to survival [39]. Function unknown (S) accounted for a large portion of the assignments, indicating the current lack of understanding of M. soudanensis genomes.
In the SEED analysis, the most abundant functions were associated with the amino acids and derivatives, carbohydrates, protein metabolism, cofactors, vitamins, prosthetic groups, pigments, and membrane transport subsystems (Fig. 2C, Table S3). Overall, the functional gene categories in the KEGG, COG, and SEED profiles were similar for M. soudanensis NC18T and NC20.
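The annotation counts reported above can be converted into the share of CDSs assigned by each database. The following is a minimal sketch using only the numbers quoted in this section; it assumes that each count is the number of CDSs that received an identifier, and the variable and function names are illustrative.

```python
# Counts copied from the text of this section.
TOTAL_CDS = {"NC18T": 5823, "NC20": 5742}
ASSIGNED = {
    "KEGG": {"NC18T": 4424, "NC20": 4413},
    "COG": {"NC18T": 3984, "NC20": 3937},  # excludes "function unknown"
    "SEED": {"NC18T": 2324, "NC20": 2248},
}


def annotation_coverage(assigned: int, total: int) -> float:
    """Return the percentage of CDSs that received an identifier."""
    if total <= 0:
        raise ValueError("total number of CDSs must be positive")
    return 100.0 * assigned / total


# Coverage per database and strain, in percent.
coverage = {
    database: {
        strain: annotation_coverage(count, TOTAL_CDS[strain])
        for strain, count in per_strain.items()
    }
    for database, per_strain in ASSIGNED.items()
}

for database, per_strain in coverage.items():
    for strain, percent in per_strain.items():
        print(f"{database:<5} {strain:<6} {percent:5.1f}%")
```

Under this assumption, KEGG assigns the largest share of CDSs in both strains and SEED the smallest, and the two strains differ only slightly within each database.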

Salt Tolerance of M. soudanensis NC18T and NC20
Osmotic adaptation is essential for bacterial survival in a high-salt environment. If the osmotic pressure of the environment is higher than that of the cells, water flows out of the cells, resulting in dehydration. Cells therefore maintain homeostasis by reaching an osmotic balance through a process called osmotic regulation. As a primary response, preservation of cell osmotic pressure involves water efflux [40] and accumulation of potassium (K+) for water retention [41].
After this primary response, osmoprotectants that are more efficient than K+, such as glycine betaine and ectoine, accumulate. These osmoprotectants, as compatible solutes, are either biosynthesized or salvaged from the environment [42][43][44]. Annotation of the M. soudanensis NC18T and NC20 genomes revealed homologs of halotolerance-associated proteins that correspond to two main strategies: "salt-in" and "compatible solute" (Table S4 and Table S5) [11]. M. soudanensis NC18T and NC20 use the ectoine transport system (DctPQM), the glycine betaine transport system (ProVWX), the ectoine transporter (YiaN), and glycine betaine and/or choline selective transporters (OpuB, OpuC, OpuD, and TC.BCT) to accumulate "compatible solutes", such as ectoine and glycine betaine, which protect cells from salt stress (Fig. 3) [45][46][47][48][49]. After uptake, choline is converted into glycine betaine by a family of oxidoreductases, such as BetA, BetB, and CMO [41,50,51]. In addition, L-ectoine synthase (EctC), a key enzyme in the production of ectoine, was identified [52]. Another compatible solute, Nε-acetyl-β-lysine, is unique to methanogenic archaea and protects the cell walls against salt stress. A gene potentially encoding lysine-2,3-aminomutase (KamA), an enzyme commonly found in methanogenic archaea that is assumed to catalyze the formation of Nε-acetyl-β-lysine from α-lysine, was also identified. Previous studies comparing the phylogenetic relationships between lysine-2,3-aminomutase-coding genes and 16S rRNA genes have suggested that horizontal gene transfer may occur between bacteria and methanogenic archaea [53]. The oxidoreductase PepQ was presumed to protect against damage caused by increasing salt concentrations in cells [54].
These two strains also have a K+ uptake system (TrkAH), a K+ uptake protein (Kup), and a K+ transport system (KdpFABC), which are used in the "salt-in" strategy; these perform one-way transport of K+ into the cytoplasm and maintain osmotic pressure to increase salt resistance (Fig. 3) [41,55,56]. Compatible solutes also have protective, stabilizing, and catalytic functions, which make them useful for industrial applications in areas such as cosmetics, health care, and biotechnology [54].

Comparative Analyses of Halotolerant-Associated Gene Clusters
The halotolerant-associated gene clusters present in the genomes of M. soudanensis NC18T, M. soudanensis NC20, M. mediterranea DSM 17316T, M. endophytica YC6887T, and M. lutilitoris GH2-6T were compared using Mauve (Fig. 4, Table 2). A comparison of the gene clusters involved in the salt-in strategy shows that all five strains had trkA and trkH genes encoding K+ uptake proteins, while M. soudanensis NC18T, M. soudanensis NC20, and M. mediterranea DSM 17316T contained additional kup genes encoding K+ uptake proteins. Moreover, M. soudanensis NC18T and M. soudanensis NC20 have more genes involved in K+ uptake and transport for the salt-in strategy and additional genes involved in ectoine transport and synthesis for the compatible solute strategy. Consequently, M. soudanensis NC18T and M. soudanensis NC20 have more diverse halotolerant-associated gene clusters, which support the maintenance of a normal metabolic capacity under high-salinity conditions. Given their metabolic diversity, low nutritional requirements, and genetic mechanisms of adaptation to harsh conditions such as high ionic strength, halophiles are considered unique potential natural sources for the discovery of bioactive compounds and compatible solutes, including novel and/or extraordinary enzymes [57,58]. These biomolecules are valuable and show commercial potential in the food, pharmaceutical, biomedical, industrial, and environmental fields [59,60,61]. Therefore, our results should provide new insights into the halotolerance mechanism of halotolerant and halophilic microbes and their potential applications in environmental biotechnologies.
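The presence/absence comparison described above can be expressed as simple set operations. The sketch below encodes only the three salt-in genes whose distribution is stated explicitly in this paragraph (trkA, trkH, and kup); it does not reproduce Table 2, and the data structure and function names are illustrative assumptions rather than the authors' method.

```python
STRAINS = (
    "M. soudanensis NC18T",
    "M. soudanensis NC20",
    "M. mediterranea DSM 17316T",
    "M. endophytica YC6887T",
    "M. lutilitoris GH2-6T",
)

# Gene -> strains reported in the paragraph above as carrying it.
PRESENCE = {
    "trkA": set(STRAINS),
    "trkH": set(STRAINS),
    "kup": set(STRAINS[:3]),
}


def shared_by_all(presence: dict, strains: tuple) -> list:
    """Genes reported in every strain of the comparison."""
    return sorted(gene for gene, carriers in presence.items() if set(strains) <= carriers)


def not_reported_in(presence: dict, strains: tuple) -> dict:
    """For each gene missing from at least one strain, list the strains not reported to carry it."""
    return {
        gene: sorted(set(strains) - carriers)
        for gene, carriers in presence.items()
        if set(strains) - carriers
    }


print("Shared by all strains:", shared_by_all(PRESENCE, STRAINS))
for gene, missing in not_reported_in(PRESENCE, STRAINS).items():
    print(f"{gene} not reported in:", ", ".join(missing))
```

Extending the dictionary with the remaining genes from Table 2 would give the full salt-in and compatible solute profiles for the five strains.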